'Uncanny Valley': ICE's Secret Expansion Plans, Palantir Workers' Ethical Concerns, and AI Assistants

WIRED

In this episode of Uncanny Valley, our hosts dive into WIRED's scoop about a secret Trump administration campaign extending right into your backyard. This week, hosts Brian Barrett, Leah Feiger, and Zoë Schiffer discuss WIRED's big scoop on ICE's startling plans to expand to nearly every state in the US. Plus, a WIRED writer lets the viral AI assistant OpenClaw run his life for a week to give listeners a peek at what AI agents can and can't do. ICE Is Expanding Across the US at Breakneck Speed. Write to us at uncannyvalley@wired.com. You can always listen to this week's podcast through the audio player on this page, but if you want to subscribe for free to get every episode, here's how: if you're on an iPhone or iPad, open the app called Podcasts, or just tap this link.

I want to continue a conversation that we started yesterday in Slack after work hours for some of us. And this is about the men's short program-- but very specifically I want to pick up on the conversation where Zoë had very strong feelings about the results of men's figure skating.

I feel like we need to back up, because you and Leah authentically care about the Olympics so much, and I think you just know more about sports than I do. I deeply have never engaged with sports ever, just as a whole rule, as a category. It doesn't exist in my life.

Say the lines, say the lines, Zoë, or I'm going to read them verbatim from Slack.

Wait, I don't even know what you're talking about. I was merely surprised when I watched, because the Americans went, and I thought, wow, that guy basically fell over and was clumping around the ice, and then Japan went, and they were sailing around like little swans, and then when the gold medal came, it went to the Americans. I couldn't believe what had happened. No one else seemed outraged.

For a little backup for our non-ice-skating Olympic fans: I was of course referring to Ilia Malinin, who a number of publications and sports experts say might actually be one of the greatest figure skaters of all time.


US prisons battle evolving drone technology used to smuggle contraband to inmates

FOX News

Evolving drone technology has fueled an increase in airborne smuggling operations over U.S. prisons, putting authorities in a tough spot as federal regulations prevent the drones from being brought down.


Exploring Fusion Strategies for Multimodal Vision-Language Systems

Willis, Regan, Bakos, Jason

arXiv.org Artificial Intelligence

Modern machine learning models often combine multiple input streams of data to more accurately capture the information that informs their decisions. In multimodal machine learning, choosing the strategy for fusing data together requires careful consideration of the application's accuracy and latency requirements, as fusing the data at earlier or later stages in the model architecture can lead to performance changes in accuracy and latency. To demonstrate this trade-off, we investigate different fusion strategies using a hybrid BERT and vision network framework that integrates image and text data. We explore two different vision networks: MobileNetV2 and ViT. We propose three models for each vision network, which fuse data at late, intermediate, and early stages in the architecture. We evaluate the proposed models on the CMU-MOSI dataset and benchmark their latency on an NVIDIA Jetson Orin AGX. Our experimental results demonstrate that while late fusion yields the highest accuracy, early fusion offers the lowest inference latency. We describe the three proposed model architectures and discuss the accuracy and latency trade-offs, concluding that data fusion earlier in the model architecture results in faster inference times at the cost of accuracy.
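To make the trade-off concrete, here is a minimal sketch of late versus early fusion for a two-stream image-and-text classifier. The tiny linear encoders are illustrative stand-ins, not the paper's actual BERT, MobileNetV2, or ViT branches, and the dimensions are assumptions.

# Minimal sketch of late vs. early fusion for a two-stream (image + text)
# classifier. Encoders are small stand-ins, not the paper's models.
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Each modality is encoded separately; results are merged at the end."""
    def __init__(self, img_dim=512, txt_dim=768, n_classes=2):
        super().__init__()
        self.img_enc = nn.Sequential(nn.Linear(img_dim, 128), nn.ReLU())
        self.txt_enc = nn.Sequential(nn.Linear(txt_dim, 128), nn.ReLU())
        self.head = nn.Linear(128 * 2, n_classes)  # fuse by concatenation

    def forward(self, img_feat, txt_feat):
        z = torch.cat([self.img_enc(img_feat), self.txt_enc(txt_feat)], dim=-1)
        return self.head(z)

class EarlyFusion(nn.Module):
    """Modality features are concatenated before any shared processing,
    so a single, cheaper trunk handles the joint representation."""
    def __init__(self, img_dim=512, txt_dim=768, n_classes=2):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, img_feat, txt_feat):
        return self.trunk(torch.cat([img_feat, txt_feat], dim=-1))

img = torch.randn(4, 512)  # e.g. pooled vision-network features
txt = torch.randn(4, 768)  # e.g. pooled BERT features
print(LateFusion()(img, txt).shape, EarlyFusion()(img, txt).shape)

The sketch also hints at why early fusion is faster at inference time: one shared trunk replaces two full per-modality encoders.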


Incremental Boosting Convolutional Neural Network for Facial Action Unit Recognition

Shizhong Han, Zibo Meng, AHMED-SHEHAB KHAN, Yan Tong

Neural Information Processing Systems

Recognizing facial action units (AUs) from spontaneous facial expressions is still a challenging problem. Most recently, CNNs have shown promise on facial AU recognition. However, the learned CNNs are often overfitted and do not generalize well to unseen subjects due to limited AU-coded training images. We proposed a novel Incremental Boosting CNN (IB-CNN) to integrate boosting into the CNN via an incremental boosting layer that selects discriminative neurons from the lower layer and is incrementally updated on successive mini-batches. In addition, a novel loss function that accounts for errors from both the incremental boosted classifier and individual weak classifiers was proposed to fine-tune the IB-CNN. Experimental results on four benchmark AU databases have demonstrated that the IB-CNN yields significant improvement over the traditional CNN and the boosting CNN without incremental learning, as well as outperforming the state-of-the-art CNN-based methods in AU recognition. The improvement is more impressive for the AUs that have the lowest frequencies in the databases.
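As a loose illustration of the incremental-boosting idea (a deliberate simplification, not the paper's exact algorithm or loss): on each mini-batch, score the neurons of a feature layer, select the most discriminative ones as a weak classifier, and fold that weak classifier into a running ensemble.

# Loose sketch of incremental boosting over mini-batches. The neuron
# scoring, selection rule, and blending here are illustrative choices.
import numpy as np

class IncrementalBoostingLayer:
    def __init__(self, n_features, k=16, decay=0.9):
        self.w = np.zeros(n_features)  # accumulated ensemble weights
        self.k, self.decay = k, decay

    def update(self, feats, labels):
        """feats: (batch, n_features) activations; labels: (batch,) in {-1,+1}."""
        # Score each neuron by |correlation| with the labels on this batch.
        corr = (feats * labels[:, None]).mean(axis=0)
        top = np.argsort(-np.abs(corr))[: self.k]  # discriminative neurons
        weak = np.zeros_like(self.w)
        weak[top] = corr[top]                      # weak classifier over them
        # Incremental update: blend the new weak learner into the ensemble.
        self.w = self.decay * self.w + (1 - self.decay) * weak

    def predict(self, feats):
        return np.sign(feats @ self.w)

rng = np.random.default_rng(0)
layer = IncrementalBoostingLayer(n_features=64)
for _ in range(10):                                   # successive mini-batches
    X = rng.normal(size=(32, 64))
    y = np.sign(X[:, 0] + 0.1 * rng.normal(size=32))  # neuron 0 is informative
    layer.update(X, y)
print((layer.predict(X) == y).mean())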


Biomedical Hypothesis Explainability with Graph-Based Context Retrieval

Tyagin, Ilya, Valipour, Saeideh, Sikirzhytskaya, Aliaksandra, Shtutman, Michael, Safro, Ilya

arXiv.org Artificial Intelligence

We introduce an explainability method for biomedical hypothesis generation systems, built on top of the novel Hypothesis Generation Context Retriever framework. Our approach combines semantic graph-based retrieval with data-restrictive training to simulate real-world discovery constraints. Integrated with large language models (LLMs) via retrieval-augmented generation, the system explains hypotheses with contextual evidence from published scientific literature. We also propose a novel feedback-loop approach, which iteratively identifies and corrects flawed parts of LLM-generated explanations, refining both the evidence paths and the supporting context. We demonstrate the performance of our method with multiple large language models and evaluate explanation and context-retrieval quality through both expert-curated assessment and large-scale automated analysis.
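A minimal sketch of the feedback-loop pattern described here: generate an explanation from retrieved context, have a critic pass flag unsupported claims, then re-retrieve evidence and revise. The llm and retrieve callables are hypothetical stand-ins, not the paper's Hypothesis Generation Context Retriever API.

# Sketch of iterative critique-and-revise over retrieved evidence.
# `llm` and `retrieve` are hypothetical stand-ins for illustration.
def explain_with_feedback(hypothesis, llm, retrieve, max_rounds=3):
    context = retrieve(hypothesis)                  # graph-based retrieval
    explanation = llm(f"Explain {hypothesis} using:\n{context}")
    for _ in range(max_rounds):
        flaws = llm(f"List unsupported claims in:\n{explanation}\n"
                    f"given only this evidence:\n{context}")
        if "none" in flaws.lower():
            break                                   # explanation is grounded
        context += "\n" + retrieve(flaws)           # fetch evidence for gaps
        explanation = llm(f"Revise the explanation, fixing:\n{flaws}\n"
                          f"Hypothesis: {hypothesis}\nEvidence:\n{context}")
    return explanation, context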


Beyond the Uncanny Valley: A Mixed-Method Investigation of Anthropomorphism in Protective Responses to Robot Abuse

Yang, Fan, Li, Lingyao, Hu, Yaxin, Rodgers, Michael, Ma, Renkai

arXiv.org Artificial Intelligence

Robots with anthropomorphic features are increasingly shaping how humans perceive and morally engage with them. Our research investigates how different levels of anthropomorphism influence protective responses to robot abuse, extending the Computers as Social Actors (CASA) and uncanny valley theories into a moral domain. In an experiment, we invite 201 participants to view videos depicting abuse toward a robot with low (Spider), moderate (Two-Foot), or high (Humanoid) anthropomorphism. To provide a comprehensive analysis, we triangulate three modalities: self-report surveys measuring emotions and uncanniness, physiological data from automated facial expression analysis, and qualitative reflections. Findings indicate that protective responses are not linear. The moderately anthropomorphic Two-Foot robot, rated highest in eeriness and "spine-tingling" sensations consistent with the uncanny valley, elicited the strongest physiological anger expressions. Self-reported anger and guilt are significantly higher for both the Two-Foot and Humanoid robots compared to the Spider. Qualitative findings further reveal that as anthropomorphism increases, moral reasoning shifts from technical assessments of property damage to condemnation of the abuser's character, while governance proposals expand from property law to calls for quasi-animal rights and broader societal responsibility. These results suggest that the uncanny valley does not dampen moral concern but paradoxically heightens protective impulses, offering critical implications for robot design, policy, and future legal frameworks.


Adversarially-Aware Architecture Design for Robust Medical AI Systems

Gerhart, Alyssa, Iyangar, Balaji

arXiv.org Artificial Intelligence

Adversarial attacks pose a severe risk to AI systems used in healthcare, capable of misleading models into dangerous misclassifications that can delay treatments or cause misdiagnoses. These attacks, often imperceptible to human perception, threaten patient safety, particularly in underserved populations. Our study explores these vulnerabilities through empirical experimentation on a dermatological dataset, where adversarial methods significantly reduce classification accuracy. Through detailed threat modeling, experimental benchmarking, and model evaluation, we demonstrate both the severity of the threat and the partial success of defenses like adversarial training and distillation. Our results show that while defenses reduce attack success rates, they must be balanced against model performance on clean data. We conclude with a call for integrated technical, ethical, and policy-based approaches to build more resilient, equitable AI in healthcare.
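For a concrete sense of the attack class studied here, this is a minimal FGSM sketch of the kind of imperceptible perturbation involved, followed by one adversarial-training step as a partial defense. The model below is a trivial stand-in classifier, not the paper's dermatology model, and the image shapes and epsilon are assumptions.

# Minimal FGSM attack plus one adversarial-training step (illustrative).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2))  # stand-in
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, eps=0.01):
    """Perturb x one step in the direction that maximizes the loss."""
    x = x.clone().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

x = torch.rand(8, 3, 64, 64)   # stand-in skin-lesion images
y = torch.randint(0, 2, (8,))
x_adv = fgsm(x, y)             # visually near-identical, often misclassified

# Adversarial training step: fit on the perturbed batch as well,
# trading some clean-data accuracy for robustness.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
opt.zero_grad()
(loss_fn(model(x), y) + loss_fn(model(x_adv), y)).backward()
opt.step()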


Testing-driven Variable Selection in Bayesian Modal Regression

Duan, Jiasong, Zhang, Hongmei, Huang, Xianzheng

arXiv.org Machine Learning

We propose a Bayesian variable selection method in the framework of modal regression for heavy-tailed responses. An efficient expectation-maximization algorithm is employed to expedite parameter estimation. A test statistic is constructed to exploit the shape of the model error distribution to effectively separate informative covariates from unimportant ones. Through simulations, we demonstrate and evaluate the efficacy of the proposed method in identifying important covariates in the presence of non-Gaussian model errors. Finally, we apply the proposed method to analyze two datasets arising in genetic and epigenetic studies.
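For readers unfamiliar with modal regression: a standard kernel-based formulation (which may differ in detail from the authors' Bayesian setup) targets the conditional mode rather than the mean,

$$\hat{\beta} = \arg\max_{\beta} \; \frac{1}{nh} \sum_{i=1}^{n} \phi\!\left(\frac{y_i - x_i^{\top}\beta}{h}\right),$$

where $\phi$ is a kernel such as the standard normal density and $h$ a bandwidth. An EM algorithm alternates between computing weights $w_i \propto \phi\big((y_i - x_i^{\top}\beta^{(t)})/h\big)$ and a weighted least-squares update $\beta^{(t+1)} = (X^{\top} W X)^{-1} X^{\top} W y$, which downweights heavy-tailed outliers automatically.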


Can Small and Reasoning Large Language Models Score Journal Articles for Research Quality and Do Averaging and Few-shot Help?

Thelwall, Mike, Mohammadi, Ehsan

arXiv.org Artificial Intelligence

Assessing published academic journal articles is a common task in evaluations of departments and individuals. Whilst it is sometimes supported by citation data, Large Language Models (LLMs) may give more useful indications of article quality. Evidence of this capability exists for two of the largest LLM families, ChatGPT and Gemini, and for the medium-sized LLM Gemma3 27b, but it is unclear whether smaller LLMs and reasoning models have similar abilities. This is important because larger models may be slow and impractical in some situations, and reasoning models may perform differently. Four relevant questions are addressed with Gemma3 variants, Llama4 Scout, Qwen3, Magistral Small and DeepSeek R1, on a dataset of 2,780 medical, health and life science papers in 6 fields, with two different gold standards, one of them novel. The results suggest that smaller (open-weights) and reasoning LLMs have similar performance to ChatGPT 4o-mini and Gemini 2.0 Flash, but that 1b parameters may often, and 4b sometimes, be too few. Moreover, averaging scores from multiple identical queries seems to be a universally successful strategy, and few-shot prompts (four examples) tended to help, but the evidence was equivocal. Reasoning models did not have a clear advantage. Overall, the results show, for the first time, that smaller LLMs above 4b parameters, including reasoning models, have a substantial capability to score journal articles for research quality, especially if score averaging is used.
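The score-averaging strategy the paper finds universally helpful is simple to reproduce in outline: submit the same quality-scoring prompt several times and average the numeric replies. In this sketch, query_llm is a hypothetical stand-in for whichever model client is being evaluated, and the 1-4 scale and prompt wording are assumptions, not the paper's exact protocol.

# Sketch of averaging scores from multiple identical LLM queries.
import re
import statistics

def score_article(abstract, query_llm, n_queries=5):
    prompt = ("Score the research quality of this article from 1 (low) "
              f"to 4 (high). Reply with a single number.\n\n{abstract}")
    scores = []
    for _ in range(n_queries):                    # identical repeated queries
        reply = query_llm(prompt)
        m = re.search(r"[1-4](?:\.\d+)?", reply)  # pull the first 1-4 score
        if m:
            scores.append(float(m.group()))
    return statistics.mean(scores) if scores else None

Averaging works because individual samples from a stochastic decoder are noisy; the mean of several runs is a lower-variance estimate of the model's underlying judgment.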